Prior Information Based Bayesian Infinite Mixture Model
Authors
Abstract
Unsupervised learning methods have been tremendously successful in extracting knowledge from genomics data generated by high-throughput experimental assays. However, analyzing each dataset in isolation, without incorporating potentially informative prior knowledge, limits the utility of such procedures. Here we present a novel probabilistic model and computational algorithm for semi-supervised learning from genomics data. The probabilistic model is an extension of the Bayesian semiparametric Gaussian Infinite Mixture Model (GIMM), and the model parameters are estimated using a Markov chain Monte Carlo (MCMC) algorithm. The utility of the procedure in improving the precision of cluster analysis by incorporating prior information is demonstrated in a simulation study and in the analysis of real-world genomics data.
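As a rough illustration of the kind of model the abstract refers to, the sketch below runs a collapsed Gibbs sampler for a generic one-dimensional Gaussian infinite mixture with a Chinese restaurant process prior. It is not the authors' algorithm and does not include their prior-information extension. The function names, the known observation variance `sigma2`, the normal prior on cluster means (`mu0`, `tau2`), the concentration parameter `alpha`, and the simulated data are all assumptions made for this example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def predictive(x, n, total, mu0, tau2, sigma2):
    """Posterior predictive density of x for a cluster of n points summing to total.

    Assumes a known observation variance sigma2 and a N(mu0, tau2) prior on
    the cluster mean (illustrative choices, not taken from the paper).
    """
    post_var = 1.0 / (1.0 / tau2 + n / sigma2)
    post_mean = post_var * (mu0 / tau2 + total / sigma2)
    return normal_pdf(x, post_mean, post_var + sigma2)


def gibbs_sweep(data, labels, alpha, mu0, tau2, sigma2, rng):
    """One collapsed Gibbs sweep over all cluster labels (updated in place)."""
    counts, sums = {}, {}
    for x, z in zip(data, labels):
        counts[z] = counts.get(z, 0) + 1
        sums[z] = sums.get(z, 0.0) + x
    for i, x in enumerate(data):
        # Remove point i from its current cluster.
        z = labels[i]
        counts[z] -= 1
        sums[z] -= x
        if counts[z] == 0:
            del counts[z], sums[z]
        # Existing clusters are weighted by size times predictive density.
        candidates = list(counts)
        weights = [
            counts[k] * predictive(x, counts[k], sums[k], mu0, tau2, sigma2)
            for k in candidates
        ]
        # A brand-new cluster is weighted by alpha times the prior predictive.
        candidates.append(max(counts, default=-1) + 1)
        weights.append(alpha * predictive(x, 0, 0.0, mu0, tau2, sigma2))
        z = rng.choices(candidates, weights=weights)[0]
        labels[i] = z
        counts[z] = counts.get(z, 0) + 1
        sums[z] = sums.get(z, 0.0) + x
    return labels


# Simulated data from two well-separated groups (purely illustrative).
rng = random.Random(0)
data = [rng.gauss(-5.0, 1.0) for _ in range(30)]
data += [rng.gauss(5.0, 1.0) for _ in range(30)]
labels = [0] * len(data)  # start with every point in a single cluster
for _ in range(50):
    gibbs_sweep(data, labels, alpha=1.0, mu0=0.0, tau2=25.0, sigma2=1.0, rng=rng)
print("clusters in final sample:", len(set(labels)))
```

The number of clusters is not fixed in advance: each sweep can open a new cluster or empty an existing one, which is what makes the mixture "infinite". A semi-supervised variant along the lines of the abstract would additionally alter these assignment probabilities using prior information, which this sketch does not attempt.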
Related articles
On Bayesian Mixture Credibility
We introduce a class of Bayesian infinite mixture models first introduced by Lo (1984) to determine the credibility premium for a non-homogeneous insurance portfolio. The Bayesian infinite mixture models provide us with much flexibility in the specification of the claim distribution. We employ the sampling scheme based on a weighted Chinese restaurant process introduced in Lo et al. (1996) to e...
Infinite models for speaker clustering
In this paper we propose the use of infinite models for the clustering of speakers. Speaker segmentation is obtained through a Dirichlet Process Mixture (DPM) model, which can be interpreted as a flexible model with an infinite a priori number of components. Learning is based on a Variational Bayesian approximation of the infinite sequence. The DPM model is compared with fixed prior systems learned b...
Supplemental Information Bayesian context-specific infinite mixture model for clustering of gene expression profiles across diverse microarray datasets
OUTLINE: 1. Additional ROC curves for the simulation study 2. Patterns of gene expression based on the joint analysis of cell cycle and sporulation data. 3. Patterns of gene expression based on the analysis of individual datasets (cell cycle and sporulation) separately. 4. Prior and posterior conditional probability distributions in the context-specific infinite mixture model. 5. Dynamic anneal...
Location Reparameterization and Default Priors for Statistical Analysis
This paper develops default priors for Bayesian analysis that reproduce familiar frequentist and Bayesian analyses for models that are exponential or location. For the vector parameter case there is an information adjustment that avoids the Bayesian marginalization paradoxes and properly targets the prior on the parameter of interest thus adjusting for any complicating nonlinearity the details ...
Bayesian non-parametric parsimonious clustering
This paper proposes a new Bayesian non-parametric approach for clustering. It relies on an infinite Gaussian mixture model with a Chinese Restaurant Process (CRP) prior, and an eigenvalue decomposition of the covariance matrix of each cluster. The CRP prior makes it possible to control the model complexity in a principled way and to automatically learn the number of clusters. The covariance matrix decompo...
Journal title:
Volume, issue:
Pages: -
Publication date: 2010